Communication optimization strategies for distributed deep neural network training: A survey
Authors
Abstract
Recent trends in high-performance computing and deep learning have led to a proliferation of studies on large-scale deep neural network training. However, frequent communication among computation nodes drastically slows the overall training speed, causing bottlenecks in distributed training, particularly in clusters with limited network bandwidths. To mitigate the drawbacks of communication, researchers have proposed various optimization strategies. In this paper, we provide a comprehensive survey of these strategies from both an algorithm viewpoint and a computer network perspective. Algorithm optimizations focus on reducing the communication volume used in distributed training, while network optimizations focus on accelerating the communication between devices. At the algorithm level, we describe how to reduce the number of communication rounds and the number of transmitted bits per round. In addition, we elucidate how to overlap computation and communication. At the network level, we discuss the effects caused by network infrastructures, including logical communication schemes and network protocols. Finally, we extrapolate the potential future challenges and new research directions to accelerate communication for distributed deep neural network training.
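As an illustration of one algorithm-level strategy named in the abstract (reducing the number of transmitted bits per round), the sketch below shows a minimal top-k gradient sparsification step in Python/NumPy. The function name topk_sparsify, the compression ratio, and the tensor shapes are hypothetical choices for this example, not an implementation taken from the surveyed works.

import numpy as np

def topk_sparsify(grad, ratio=0.01):
    """Keep only the largest-magnitude `ratio` fraction of gradient entries.

    Returns the indices and values that would be transmitted, plus the
    dense gradient reconstructed on the receiving side.
    """
    flat = grad.ravel()
    k = max(1, int(flat.size * ratio))
    # Indices of the k largest-magnitude entries.
    idx = np.argpartition(np.abs(flat), -k)[-k:]
    values = flat[idx]
    # The receiver rebuilds a sparse gradient from the (index, value) pairs.
    restored = np.zeros_like(flat)
    restored[idx] = values
    return idx, values, restored.reshape(grad.shape)

# Example: a worker compresses its local gradient before communication.
rng = np.random.default_rng(0)
g = rng.normal(size=(1024, 256)).astype(np.float32)
idx, vals, g_hat = topk_sparsify(g, ratio=0.01)
print(f"sent {idx.size} of {g.size} entries "
      f"({idx.size / g.size:.1%} of the original volume)")

In a real data-parallel setting, each worker would transmit only the (index, value) pairs to its peers or to a parameter server; the dropped entries are typically accumulated locally and fed back into the next round's gradient (error feedback) so that the compression does not bias convergence.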
Similar resources
Large Scale Distributed Hessian-Free Optimization for Deep Neural Network
Training a deep neural network is a high-dimensional and highly non-convex optimization problem. In this paper, we revisit the Hessian-free optimization method for deep networks with negative curvature direction detection. We also develop its distributed variant and demonstrate superior scaling potential to SGD, which allows utilizing larger computing resources more efficiently, thus enabling large ...
Exploring Strategies for Training Deep Neural Networks
Deep multi-layer neural networks have many levels of non-linearities allowing them to compactly represent highly non-linear and highly-varying functions. However, until recently it was not clear how to train such deep networks, since gradient-based optimization starting from random initialization often appears to get stuck in poor solutions. Hinton et al. recently proposed a greedy layer-wise u...
Regularization and Optimization strategies in Deep Convolutional Neural Network
Convolutional Neural Networks, known as ConvNets, perform exceptionally well in many complex machine learning tasks. The architecture of ConvNets demands a huge and rich amount of data and involves a vast number of parameters, which makes learning computationally expensive, slows convergence towards the global minimum, and can trap the model in local minima with poor predictions. In some cases, a...
Scalable Minimum Bayes Risk Training of Deep Neural Network Acoustic Models Using Distributed Hessian-free Optimization
Training neural network acoustic models with sequence-discriminative criteria, such as state-level minimum Bayes risk (sMBR), has been shown to produce large improvements in performance over cross-entropy. However, because they entail the processing of lattices, sequence criteria are much more computationally intensive than cross-entropy. We describe a distributed neural network training algorithm, ...
A conjugate gradient based method for Decision Neural Network training
Decision Neural Network is a new approach for solving multi-objective decision-making problems based on artificial neural networks. Using inaccurate evaluation data, network training has been improved and the number of required training data sets has decreased. The available training method is based on the gradient descent method (BP). One of its limitations is related to its convergence speed. Therefore,...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
Journal
Journal title: Journal of Parallel and Distributed Computing
Year: 2021
ISSN: 1096-0848, 0743-7315
DOI: https://doi.org/10.1016/j.jpdc.2020.11.005